Rank | Count | Beginning |
---|---|---|
26923 | 2489 | Юханшыв |
18805 | 575 | Совет |
16110 | 521 | Почтă |
4985 | 323 | Вăл |
10772 | 317 | Ку |
962 | 275 | Çак |
24676 | 244 | Чăваш |
10973 | 243 | Кӳлĕ |
12566 | 222 | Мăн |
21782 | 192 | Унăн |
3576 | 174 | Анчах |
539 | 173 | Çав |
9051 | 155 | Кĕçĕн |
23376 | 148 | Халăх |
580 | 135 | Çавăн |
14960 | 133 | Пĕрремĕш |
17414 | 121 | Район |
21781 | 119 | Ун |
23553 | 118 | Халĕ |
24141 | 117 | Хула |
5613 | 116 | Вĕсем |
23721 | 116 | Хальхи |
36 | 115 | Ăна |
29502 | 115 | Ял |
11300 | 114 | Кун-çулĕ |
25268 | 114 | Чи |
11466 | 99 | Кунта |
20132 | 96 | Тĕп |
16942 | 95 | Пурнăçĕ |
7327 | 90 | Департамент |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word N-grams at the beginning of sentences give some insight into sentence composition.
For N = 1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
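The same count can be sketched outside the database. The following is a minimal Python illustration, not the tool used to produce the table above: the function name `sentence_beginnings`, its parameters, and the toy English sentences are all assumptions made for the example. It follows the query's logic by taking everything before the N-th space of each sentence, grouping by that prefix, and sorting by descending count.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent sentence beginnings of n words.

    Hypothetical helper: it splits on single spaces and rejoins the
    first n pieces, which keeps the text before the n-th space
    (the same prefix the query's substring_index call selects).
    """
    counts = Counter(" ".join(s.split(" ")[:n]) for s in sentences)
    return counts.most_common(limit)


# Toy data for illustration only; not taken from the corpus.
sample = [
    "The cat sleeps.",
    "The dog barks.",
    "A bird sings.",
    "The cat purrs.",
]

for n in (1, 2):
    print(n, sentence_beginnings(sample, n=n, limit=3))
```

Passing `n=2`, `n=3`, or `n=4` gives the longer beginnings covered in the following subsections. As with the query, no tokenization beyond splitting on spaces is applied, so punctuation attached to a word stays part of the beginning.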